Skip to content

feat: stream Bash progress and guard message actions during replies#2879

Merged
zhayujie merged 3 commits into
zhayujie:masterfrom
yangziyu-hhh:master
Jun 12, 2026
Merged

feat: stream Bash progress and guard message actions during replies#2879
zhayujie merged 3 commits into
zhayujie:masterfrom
yangziyu-hhh:master

Conversation

@yangziyu-hhh

Copy link
Copy Markdown
Contributor

What does this PR do?

This PR adds bounded streaming progress for long-running Bash tool executions
in the web console and prevents message edit/delete operations while the
current session is generating a reply.

Key improvements:

  • Long-running Bash commands now show recent output snapshots in the existing
    tool card
  • Fast burst-output commands finish normally without producing unnecessary
    progress events
  • Bash progress supports stdout, stderr, and output without newline characters
  • Live progress is replaced by the authoritative final tool result to avoid
    duplicate output
  • Edit and Delete actions are disabled while the current session is generating
    a reply
  • Edited messages correctly refresh the input state and can be submitted

Changes

🖥️ Web Console

Streaming Tool Output

  • Shows bounded Bash progress snapshots in the existing tool output area
  • Associates tool start, progress, and end events through stable
    tool_call_id values
  • Automatically expands a tool card when progress becomes available
  • Replaces the live snapshot with the final tool result when execution ends
  • Uses textContent for streamed output to prevent HTML injection

Message Action Guards

  • Disables Edit and Delete while the current session is generating a reply
  • Keeps restrictions scoped to the current session, without blocking idle
    sessions when another session runs in the background
  • Restores Edit and Delete after completion, cancellation, or failure
  • Displays contextual disabled-state tooltips
  • Refreshes textarea size and Send button state after clicking Edit

🛠️ Tool Execution

  • Adds an optional best-effort progress callback to BaseTool
  • Emits tool_execution_progress events from AgentStreamExecutor
  • Clears the progress callback after every tool execution to prevent
    cross-request event leaks
  • Preserves existing behavior for tools that do not report progress

💻 Bash Tool

  • Replaces the main blocking execution path with subprocess.Popen
  • Continuously drains stdout and stderr using background reader threads
  • Reads byte chunks instead of lines, supporting output without newline
    characters
  • Starts publishing progress only after a short delay
  • Publishes changed recent-output snapshots at controlled intervals
  • Keeps the progress buffer bounded to avoid high memory usage
  • Avoids emitting progress for commands that finish quickly
  • Redacts common secrets and configured environment values from streamed
    progress
  • Preserves existing timeout, exit-code, truncation, retry, and final-output
    behavior

🐛 Bug Fixes

  • Fixed edited messages leaving the Send button disabled
  • Fixed Edit being available while a reply was still being generated
  • Fixed Delete being available while a reply was still being generated
  • Prevented live progress from being displayed separately from the final
    output
  • Prevented fast high-volume commands from flooding SSE with progress events

Files Changed

File Changes
agent/tools/base_tool.py Add optional best-effort progress callback API
agent/protocol/agent_stream.py Emit and clean up tool progress events
agent/tools/bash/bash.py Add bounded delayed Bash progress snapshots
channel/web/web_channel.py Forward tool progress and tool_call_id through SSE
channel/web/static/js/console.js Render progress and guard Edit/Delete actions
channel/web/static/css/console.css Add streaming output and disabled action styles

Backward Compatibility

✅ Tools that do not report progress continue to work unchanged
✅ Fast Bash commands continue to use the existing start/end behavior
✅ Existing Bash final result, timeout, exit-code, and truncation behavior is preserved
✅ Message action restrictions only apply while the current session has an active reply
✅ Existing sessions and persisted messages remain compatible

Testing

  • Python compilation checks
  • JavaScript syntax check with node --check
  • git diff --check
  • Agent event order: start → progress → end
  • Fast 10,000-line burst command emits no progress events
  • Long-running staged command emits periodic progress snapshots
  • Output without newline characters is streamed
  • Common secrets are redacted from progress output
  • Final tool result replaces live progress output
  • Edit and Delete are disabled during active replies
  • Edit and Delete are restored after the reply ends
  • Edited messages can be submitted normally
  • Current-session restrictions do not block unrelated background sessions

Type of change

  • Bug fix
  • New feature
  • Documentation upd

@zhayujie

Copy link
Copy Markdown
Owner

Thanks for the contribution! The feature direction is great — live Bash output is a real UX improvement. However, a few issues need to be addressed before this can be merged:

1. Tool card expand/collapse behavior (UX)

  • On tool_end, please remove the expanded class so the card returns to its default collapsed state, consistent with the current behavior (auto-collapse after execution; user clicks to expand).
  • The tool_progress handler calls classList.add('expanded') on every event, so if the user manually collapses the card during execution it keeps popping back open. Please expand only on the first progress event and respect the user's manual toggle afterwards.

2. Process tree is not killed on Windows (functional defect)

In _kill_process, the Windows branch only calls process.kill(), which kills cmd.exe itself but not its descendants (e.g. pip, npm). Orphaned children keep the stdout pipe open, which directly triggers the hang described in issue 3. Please use taskkill /F /T /PID on Windows to kill the whole process tree, matching the os.killpg behavior on POSIX:

if self._IS_WIN:
    subprocess.run(["taskkill", "/F", "/T", "/PID", str(process.pid)], capture_output=True)

3. reader.join() without timeout can hang forever (critical)

If the command spawns a background process (e.g. nohup xxx &, or start /b xxx on Windows) that keeps stdout/stderr open, os.read never returns EOF after the main process exits, so reader.join() in the finally block blocks forever. The entire tool-execution thread hangs and the agent session is stuck permanently. The old subprocess.run path did not hang the calling thread, so this is a regression. Please use reader.join(timeout=5) — the readers are daemon threads, so abandoning them is safe and no already-captured output is lost.

4. Frontend memory / replay buffer growth (performance)

Each progress event can carry up to 32KB, emitted every 0.5s, and every event is appended to the buffer.items replay buffer (capped at 5000 entries, not bytes). A long-running command could grow browser memory to 100MB+. Suggestions: send only the last ~4KB (or last N lines) per snapshot, and keep only the latest progress event per tool_call_id in the replay buffer.

5. i18n inconsistency (minor)

The disabled-state tooltips are hardcoded English strings ("Reply is being generated; editing is temporarily unavailable."). Please route them through the existing t() / I18N system and add Chinese translations.

6. Empty output still renders the "Output" label (minor)

On tool_end, the "Output" label is rendered even when result is empty. The old logic skipped the output section when there was no result — please keep that behavior.

7. Tests (recommend)

This PR rewrites the core execution path of the Bash tool but ships no automated tests. Please add unit tests covering: fast commands, timeout kill, commands spawning background processes, and output without trailing newlines. Also, the test checklist in the PR description doesn't mention Windows — please verify timeout and long-running command scenarios on a real Windows environment.

@yangziyu-hhh

Copy link
Copy Markdown
Contributor Author

Thanks for the contribution! The feature direction is great — live Bash output is a real UX improvement. However, a few issues need to be addressed before this can be merged:

1. Tool card expand/collapse behavior (UX)

  • On tool_end, please remove the expanded class so the card returns to its default collapsed state, consistent with the current behavior (auto-collapse after execution; user clicks to expand).
  • The tool_progress handler calls classList.add('expanded') on every event, so if the user manually collapses the card during execution it keeps popping back open. Please expand only on the first progress event and respect the user's manual toggle afterwards.

2. Process tree is not killed on Windows (functional defect)

In _kill_process, the Windows branch only calls process.kill(), which kills cmd.exe itself but not its descendants (e.g. pip, npm). Orphaned children keep the stdout pipe open, which directly triggers the hang described in issue 3. Please use taskkill /F /T /PID on Windows to kill the whole process tree, matching the os.killpg behavior on POSIX:

if self._IS_WIN:
    subprocess.run(["taskkill", "/F", "/T", "/PID", str(process.pid)], capture_output=True)

3. reader.join() without timeout can hang forever (critical)

If the command spawns a background process (e.g. nohup xxx &, or start /b xxx on Windows) that keeps stdout/stderr open, os.read never returns EOF after the main process exits, so reader.join() in the finally block blocks forever. The entire tool-execution thread hangs and the agent session is stuck permanently. The old subprocess.run path did not hang the calling thread, so this is a regression. Please use reader.join(timeout=5) — the readers are daemon threads, so abandoning them is safe and no already-captured output is lost.

4. Frontend memory / replay buffer growth (performance)

Each progress event can carry up to 32KB, emitted every 0.5s, and every event is appended to the buffer.items replay buffer (capped at 5000 entries, not bytes). A long-running command could grow browser memory to 100MB+. Suggestions: send only the last ~4KB (or last N lines) per snapshot, and keep only the latest progress event per tool_call_id in the replay buffer.

5. i18n inconsistency (minor)

The disabled-state tooltips are hardcoded English strings ("Reply is being generated; editing is temporarily unavailable."). Please route them through the existing t() / I18N system and add Chinese translations.

6. Empty output still renders the "Output" label (minor)

On tool_end, the "Output" label is rendered even when result is empty. The old logic skipped the output section when there was no result — please keep that behavior.

7. Tests (recommend)

This PR rewrites the core execution path of the Bash tool but ships no automated tests. Please add unit tests covering: fast commands, timeout kill, commands spawning background processes, and output without trailing newlines. Also, the test checklist in the PR description doesn't mention Windows — please verify timeout and long-running command scenarios on a real Windows environment.

Thanks for the detailed review. I’ve addressed all requested changes:

  • Tool cards now expand only on the first progress event, respect manual collapse, and automatically collapse on tool_end.
  • Windows process termination now uses taskkill /F /T /PID to terminate the complete process tree, with fallback handling.
  • Reader threads use a bounded shared join timeout to prevent background processes holding pipes open from blocking indefinitely.
  • Progress snapshots are limited to 4 KiB, and the replay buffer retains only the latest progress event per tool_call_id.
  • Disabled Edit/Delete tooltips now use the existing English and Chinese i18n translations.
  • Empty results no longer render an Output section.
  • Added automated tests covering fast commands, timeout handling, background processes, output without trailing newlines, and Windows process-tree termination.

I also added a windows-latest GitHub Actions workflow and verified the timeout and long-running command scenarios successfully on a real Windows runner. All Windows Bash streaming tests passed.

@zhayujie

Copy link
Copy Markdown
Owner

thanks for the quick turnaround! Verified all the fixes locally, the bounded reader.join(), taskkill /T process-tree termination, 4KB snapshots, replay buffer dedup, i18n, and the collapse behavior all look good, and the test suite passes on macOS. Nice work on the Windows tests as well

merged. Thanks again for the contribution! 🎉

@zhayujie zhayujie merged commit 80d0a6a into zhayujie:master Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants